Kevin Mader
13 March 2014
With inconsistent or ever-changing illumination it may not be possible to apply the same threshold to every image.
is easy using a threshold and a size criterion (we know how big the cells should be)
is much more difficult because the small channels, with radii on the same order as the pixel size, are obscured by partial volume effects and noise.
Given that applying a threshold is such a common and significant step, many tools have been developed to perform it automatically (unsupervised). This is particularly important in setups where images are rarely consistent, such as outdoor imaging with varying lighting (sun, clouds). The methods are based on several basic principles.
Just as we visually inspect a histogram, an algorithm can examine the histogram and find local minima between two peaks, points of maximum or minimum entropy, and other features.
These look at the statistics of the thresholded image itself (like entropy) to estimate the threshold.
These search for a threshold which delivers the desired results in the final objects. For example, if you know you have an image of cells and each cell is between 200-10000 pixels, the algorithm tries thresholds until the objects are of the desired size.
There are many methods, and they can be complicated to implement yourself. FIJI offers many of them as built-in functions, so you can automatically try all of them on your image.
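To make the histogram-based idea concrete, here is a minimal pure-Python sketch of one such method, Otsu's threshold, which picks the intensity that maximizes the between-class variance of the histogram. The function name and the example pixel values are illustrative, not from the slides.

```python
def otsu_threshold(pixels, n_bins=256):
    """Return the threshold maximizing between-class variance (Otsu's method)."""
    # Build the intensity histogram
    hist = [0] * n_bins
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0.0
    for t in range(n_bins):
        w_bg += hist[t]            # pixels at or below the candidate threshold
        if w_bg == 0:
            continue
        w_fg = total - w_bg        # pixels above the candidate threshold
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance: large when the two classes are well separated
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# A bimodal example: dark background around 20, bright objects around 200
pixels = [20] * 500 + [25] * 300 + [200] * 150 + [210] * 50
t = otsu_threshold(pixels)
```

For this clearly bimodal histogram the returned threshold lands between the two peaks; real images with unequal phase fractions are exactly where such methods struggle, as discussed below.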
While an incredibly useful tool, there are many potential pitfalls to these automated techniques.
These methods are very sensitive to the distribution of pixels in your image: they may work really well on images with equal amounts of each phase but fail badly on images with very high amounts of one phase compared to the others.
These methods are sensitive to noise, and a large noise content in the image can change statistics like entropy significantly.
These methods are inherently biased by the expectations you have. If you want to find objects between 200 and 1000 pixels you will find them; they just might not be anything meaningful.
Imaging science rarely represents the ideal world and will never be 100% perfect. At some point we need to write our master's thesis, defend, or publish a paper. These are approaches for more qualitative assessment; later we will cover how to do this more robustly with quantitative approaches.
One approach is to simulate everything (including noise) as well as possible, apply these techniques to many realizations of the same image, and qualitatively keep track of how often the results accurately identify your phase. Hint: >95% seems to convince most biologists.
Apply the methods to each sample and keep track of which threshold was used for each one. Then go back, apply every threshold to every sample, and keep track of how many of the results are correct enough to be used for further study.
Come up with the worst-case scenario (noise, misalignment, etc.) and assess how unacceptable the results are. Then try to estimate the interquartile range (the middle 25%-75% of images).
For some images a single threshold does not work
Now we apply two important steps. The first is to remove the objects which are not cells (too small) using an opening operation.
The second step is to keep only the pixels which are connected (by looking again at a neighborhood \( \mathcal{N} \)) to the air voxels and to ignore the others. This goes back to our original supposition that the smaller structures are connected to the larger structures.
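These two steps can be sketched in pure Python on a small binary 2-D image (nested lists, 4-connectivity). As a simplification, a component-size filter stands in for the morphological opening, and the `seeds` set marks the known air voxels; both names and the helper functions are illustrative, not from the slides.

```python
from collections import deque

def component(img, start, seen):
    """Collect one 4-connected component of foreground pixels via BFS."""
    h, w = len(img), len(img[0])
    comp, queue = [], deque([start])
    seen.add(start)
    while queue:
        y, x = queue.popleft()
        comp.append((y, x))
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and (ny, nx) not in seen:
                seen.add((ny, nx))
                queue.append((ny, nx))
    return comp

def filter_components(img, min_size, seeds):
    """Keep components that are large enough AND touch one of the seed pixels."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    seen = set()
    for y in range(h):
        for x in range(w):
            if img[y][x] and (y, x) not in seen:
                comp = component(img, (y, x), seen)
                # Step 1: drop too-small objects; step 2: require a seed connection
                if len(comp) >= min_size and any(p in seeds for p in comp):
                    for (cy, cx) in comp:
                        out[cy][cx] = 1
    return out

img = [[1, 1, 0, 0],
       [1, 0, 0, 1],
       [0, 0, 0, 0]]
kept = filter_components(img, min_size=2, seeds={(0, 0)})
```

The isolated single pixel at the right is removed (too small), while the component touching the seed survives.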
As we briefly covered last time, many measurement techniques produce quite rich data.
We give as initial parameters the number of groups we want to find and possibly a criterion for removing groups that are too similar.

1. Randomly create center points (groups) in vector space
2. Assign each data point to the closest center
3. Recalculate each center as the mean point of its group
4. Go back to step 2 until the groups stop changing
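The loop above is language-agnostic, so here is a minimal pure-Python sketch of it (the MATLAB `kmeans` call below does the same job; this hypothetical `kmeans` function and the example points are illustrative only).

```python
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Minimal k-means following the steps above."""
    rng = random.Random(seed)
    # Step 1: pick k random data points as the initial centers
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        # Step 2: assign each point to its closest center
        groups = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[dists.index(min(dists))].append(p)
        # Step 3: recalculate each center as the mean of its group
        new_centers = []
        for g, c in zip(groups, centers):
            if g:
                new_centers.append(tuple(sum(v) / len(g) for v in zip(*g)))
            else:
                new_centers.append(c)  # keep an empty group's old center
        # Step 4: stop once the centers (and hence the groups) no longer change
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups

# Two well-separated 2-D clusters
pts = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
       (5.0, 5.0), (5.1, 4.9), (4.9, 5.1)]
centers, groups = kmeans(pts, 2)
```

With well-separated clusters the two centers converge near the cluster means; in higher-dimensional feature spaces the same loop applies unchanged.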
What vector space do we have?
objColor=kmeans(indata,2); % cluster the full feature vectors into 2 groups
Or just the orientation
objColor=kmeans(indata(:,[aisoCol,thCol]),2); % cluster using only the selected columns
A more general approach is to use a probabilistic model for segmentation. We start with our image \( I(\vec{x})\ \forall \vec{x}\in \mathbb{R}^N \) and classify it into two phases, \( \alpha \) and \( \beta \)
\[ P(\{\vec{x} , I(\vec{x})\} | \alpha) \propto P(\alpha) + P(f(\vec{x}) | \alpha)+ P(\sum_{x^{\prime} \in \mathcal{N}} f(\vec{x^{\prime}}) | \alpha) \]
Fuzzy classification is based on fuzzy logic and fuzzy set theory. It is a general category of multi-valued logic, in place of simply true and false, and can be used to build IF and THEN statements from our probabilistic models.
\[ P(\{\vec{x} , I(\vec{x})\} | \alpha) \propto P(\alpha) + P(f(\vec{x}) | \alpha)+ P(\sum_{x^{\prime} \in \mathcal{N}} f(\vec{x^{\prime}}) | \alpha) \]
which encompass aspects of filtering, thresholding, and morphological operations
Once we have a clearly segmented image, it is often helpful to identify its sub-components. The easiest method for identifying these subcomponents is called component labeling, which again uses the neighborhood \( \mathcal{N} \) as a criterion for connectivity, so that pixels which are touching become part of the same object.
In general, the approach works well since usually when different regions are touching, they are related. It runs into issues when multiple regions agglomerate together, for example a continuous pore network (one object) or a cluster of touching cells.
The more general formulation of the problem is for networks (roads, computers, social): are the points start and finish connected?
We start out with the network and grow outward from the start node to its connections, in a brushfire-style algorithm.
Same as for networks, but the neighborhood is defined with a kernel (circular, full, line, etc.) and labels must be generated for the image.
Assign a unique label to each point in the image \[ L(x,y) = y*\text{Im}_{width}+x \]
For each point \( (x,y) \), replace \( L(x,y) \) with the minimum \( L \) value found in its neighborhood \( \mathcal{N} \)
Repeat until no more \( L \) values are changed
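The three steps above can be sketched directly in pure Python on a small binary image (nested lists, 4-connectivity); the function name and the background sentinel value of -1 are illustrative choices.

```python
def label_components(img):
    """Iterative minimum-label propagation: unique initial labels,
    then repeatedly take the neighborhood minimum until nothing changes.
    Returns a label image (-1 marks background)."""
    h, w = len(img), len(img[0])
    # Assign a unique initial label L(x,y) = y*width + x to each foreground pixel
    lab = [[(y * w + x) if img[y][x] else -1 for x in range(w)] for y in range(h)]
    changed = True
    while changed:  # repeat until no more L values are changed
        changed = False
        for y in range(h):
            for x in range(w):
                if lab[y][x] < 0:
                    continue
                # Minimum label over the 4-connected neighborhood
                m = lab[y][x]
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and lab[ny][nx] >= 0:
                        m = min(m, lab[ny][nx])
                if m < lab[y][x]:
                    lab[y][x] = m
                    changed = True
    return lab

img = [[1, 1, 0],
       [0, 0, 0],
       [0, 1, 1]]
lab = label_components(img)
```

Each connected component ends up carrying the smallest of its initial labels, so the two separate objects here receive two distinct labels.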
The image converges very quickly, and after 4 iterations the task is complete. For larger, more complicated images with thousands of components this task can take longer, but there exist much more efficient labeling algorithms which alleviate this issue.
In practice, component labeling algorithms are usually already implemented in standard tools, but it is important to understand how they work and their relationship with the neighborhood in order to interpret the results correctly.
labImg=bwlabel(segImg); % segImg is the binary segmented image; touching pixels share a label
Now all the voxels which are connected have the same label. We can then compute simple metrics on each labeled object.
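For example, counting the pixels (or voxels) belonging to each object is just a histogram over the label image. A pure-Python sketch on a small hypothetical label image (0 = background), as `bwlabel` would produce:

```python
from collections import Counter

# A small label image as produced by component labeling (0 = background)
lab_img = [[1, 1, 0, 2],
           [1, 0, 0, 2],
           [0, 0, 2, 2]]

# Count pixels per label, ignoring the background
sizes = Counter(v for row in lab_img for v in row if v != 0)
print(dict(sizes))  # {1: 3, 2: 4}
```

From the per-object sizes one can immediately apply the size criteria discussed earlier, e.g. keeping only objects in a plausible cell-size range.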
Next week we will cover how more detailed analysis can be performed on these data.